Abstract

This work studies the Zipf Law for cities in Brazil. Data from censuses of 1970, 
1980, 1991 and 2000 were used to select a sample containing only cities with 30,000 
inhabitants or more. The results show that the population distribution in Brazilian 
cities does follow a power law similar to the ones found in other countries. Estimates 
of the power law exponent were found to be 2.22 ± 0.34 for the 1970 and 1980
censuses, and 2.26 ± 0.11 for the 1991 and 2000 censuses. More accurate results were
obtained with the maximum likelihood estimator, showing an exponent equal to 
2.41 for 1970 and 2.36 for the other three years. 
1 Introduction



It was first observed by Auerbach [1], although it is often attributed to Zipf
[2], that the way in which urban aggregates are distributed, that is, the way
the populations of cities are distributed, follows a power law behaviour with
exponent α ≈ 2. If we assign probabilities to this distribution the resulting
behaviour is also a power law, known as the Zipf law. This law seems to have a
universal character, holding at the world level [3] as well as for single nations.
The exponent also seems to be independent of the area of the nation and of the
social and economic conditions of its population [4].

Power law exponents of cities have been measured in many countries. It was
reported by [3] that 2,400 cities in the U.S.A. have α = 2.1 ± 0.1, whereas [5]
reported α = 2.30 ± 0.05 for the U.S.A. census of year 2000. According to [3],
1,300 municipalities in Switzerland have α = 2.0 ± 0.1. Taking together 2,700
cities of the world with population bigger than 100,000 inhabitants produces
α = 2.03 ± 0.05 [3]. One should notice that those exponents were calculated
by least squares fitting, a method known to produce biased results if the data
are not properly handled [6]. Despite this, most results obtained so far indicate
that the exponent seems to follow the universal value of α ≈ 2.

Such power law behaviour seems to be a manifestation of the dynamics of
complex systems, whose striking feature is to show universal laws characterized
by exponents in scale invariant distributions that are essentially independent
of the details of the microscopic dynamics. Social behaviour is an example of
interaction among the elements of a complex system, in this case human beings,
giving rise to a cooperative evolution which in itself strongly differs from
the individual dynamics. So, the demographic distribution of human beings on
the Earth's surface, which has sharp peaks of concentrated population (the
cities) alternating with relatively large extensions where the population
density is much lower, follows a power law typical of complex system dynamics.

The aim of this paper is to present empirical evidence that the population
distribution of Brazilian cities also follows a power law with exponent close to
the universal value. We have selected a sample from Brazil's decennial censuses
of 1970, 1980, 1991 and 2000 and obtained probability distribution functions
of Brazilian cities with a lower cutoff of 30,000 inhabitants. Our procedure
took great care to avoid large statistical fluctuations at the tail, which
would introduce large biases in the determination of the exponent [5,6]. Our
results show that Brazilian cities do follow the universal pattern: conservative
estimates produced α = 2.22 ± 0.34 in 1970 and 1980. For 1991 and 2000 we
obtained α = 2.26 ± 0.11.

The paper is organized as follows. In §2 we present the data and our selection
methodology, whereas in §3 we present the methods to analyze the data. §4
shows the results obtained using three different techniques to calculate the
exponent α. The paper ends with a concluding section.



2 The Data 

Brazil is estimated to reach a population of approximately 185 million
inhabitants by the end of 2005, which places it 5th in the ranking of the
world's most populous countries. This population occupies over 5 thousand
cities, and although most of them have very few inhabitants, 15 cities have
more than one million people, with two of them, São Paulo and Rio de Janeiro,
having more than 5 million inhabitants. In order to obtain a sample for the
purposes of this work we need to define first of all what we mean by a city.
After surveying the way Brazil is administered we concluded that in Brazil's
case we should equate city to municipality, defined as the territorially
smallest administrative subdivision of a country that has its own
democratically elected representative leadership. This means that Brazil's
entire territory is subdivided into municipalities, or cities. Some of them
have very big areas, actually bigger than many European countries, but those
are usually located in very sparsely populated regions.

Table 1
Number of cities in Brazil.

year of census          1970    1980    1991    2000
all cities              3958    3806    4277    5238
cities with > 30,000     614     787     905     955

Censuses of Brazil's entire population have been taking place for over a
hundred years, at ten-year intervals since 1890. However, data in digital form
are only available from IBGE, the government institution responsible for the
censuses, from 1970 onwards. Data in between censuses are obtained by very
small sampling and extrapolation. Considering this, we decided to take data
only from the official censuses of the entire population that are available in
digital format, namely those of 1970, 1980, 1991 and 2000. These data show that
the number of Brazilian municipalities increased by over 30% from 1970 to 2000.
This is clearly a consequence of the fact that the definition of a city is
administrative, reflecting Brazil's internal politics, and has been varying
over the last decades.

The fact that the number of municipalities has shown a sharp increase within
the time span of our data will not affect our study. As mentioned above, most
Brazilian cities have small populations, and the Brazilian concept of a city is
a territorial subdivision which includes both rural and urban inhabitants. An
examination of the data shows, however, that cities with more than 30 thousand
inhabitants have their population almost entirely concentrated in the urban
area.¹ We have, therefore, decided to include only cities with more than 30
thousand people in our sample, which meant a significant reduction in the
number of municipalities as compared to the original raw data (see table 1).
The exclusion of the smaller cities represents in fact the exclusion of the
rural population from our sample. In 1970 40% of Brazilians were living in
cities with fewer than 30 thousand people, whereas in 2000 this figure was
reduced to 26%. In other words, roughly speaking the percentage of Brazilians
living in urban areas has increased from 60% in 1970 to 74% in 2000.



¹ Nowadays IBGE defines what is a rural, as opposed to an urban, area by
satellite imagery. See also footnote 2.
3 Data Analysis



Once our sample is selected, we need to define our method of analysis. Here
we shall follow closely the methodology for fitting power law distributions and
estimating goodness-of-fit parameters proposed by [5]. We will start with a
very brief introductory description of power law statistics in order to fix the
notation.

Let p(x) dx be the fraction of cities with population between x and x + dx, so
that p(x) defines a certain distribution of the data x. It is useful to express
this distribution in terms of the cumulative distribution function
P(x) = ∫_x^∞ p(x') dx', which is simply the probability that a city has a
population equal to or greater than x. If the fraction p(x) follows a power law
of the type,

$$p(x) = C\,x^{-\alpha}, \tag{1}$$

where α and C are constants, then P(x) also follows a power law, given by

$$P(x) = \frac{C}{\alpha-1}\,x^{-(\alpha-1)}. \tag{2}$$



Such power law distributions are also known as Zipf law or Pareto distribution.
From equation (1) it is obvious that p(x) diverges as x → 0 for any positive
value of the exponent α, and this means that the distribution must deviate
from a power law below some minimum value x_min. In other words, we can
only assume that the distribution follows a Zipf law for x above x_min, and in
this case equation (1) can be normalized as ∫_{x_min}^∞ p(x') dx' = 1 to obtain
the constant C only if x and the exponent α obey the following conditions:
α > 1, x > x_min. Power laws with exponents less than unity cannot be normalized
and do not usually occur in nature [5]. The normalized constant C, given in
terms of α and x_min, allows us to write the power laws (1) and (2) as follows,



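$$\ln p(x) = -\alpha \ln x + B, \tag{3}$$
$$\ln P(x) = (1-\alpha) \ln x + \beta, \tag{4}$$

where

$$B = \ln\left[(\alpha-1)\,x_{\min}^{\alpha-1}\right], \tag{5}$$
$$\beta = (\alpha-1) \ln x_{\min}. \tag{6}$$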
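As a brief check of where the constants in equations (5) and (6) come from, the normalization can be worked out explicitly. This is only a sketch, assuming that the power law (1) holds exactly for all x ≥ x_min:

```latex
\begin{align*}
1 &= \int_{x_{\min}}^{\infty} C\,x'^{-\alpha}\,dx'
   = \frac{C}{\alpha-1}\,x_{\min}^{-(\alpha-1)}
   \quad (\alpha > 1)
   \;\Longrightarrow\;
   C = (\alpha-1)\,x_{\min}^{\alpha-1}, \\
P(x) &= \int_{x}^{\infty} C\,x'^{-\alpha}\,dx'
   = \frac{C}{\alpha-1}\,x^{-(\alpha-1)}
   = \left(\frac{x}{x_{\min}}\right)^{-(\alpha-1)} .
\end{align*}
```

Taking logarithms gives B = ln C and β = (α − 1) ln x_min, and P(x_min) = 1, as expected for a cumulative distribution that starts at the cutoff.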



If we now define the distribution p(x_i) as being the number of cities with
population equal to or bigger than x_i, we are able to create for each sample
a set of n observed values {x_i} (i = 1, ..., n; x_1 = x_min), from which we
can estimate α. To do so we first need to create histograms of the data, once
we define the step separating each set of observed values {x_i}. The main
difficulty that arises in this procedure is the large fluctuation in the tail,
where bins have far fewer observed values than the earlier bins, which enhances
the statistical fluctuations [5]. In order to decrease such fluctuations we
have used logarithmic binning, so that bins span increasingly larger intervals
whose widths grow exponentially according to the following rule,

$$x_i = 2^{\,i-1}\,x_{\min}. \tag{7}$$

Table 2
Distribution functions of Brazilian municipalities.

                        1970              1980              1991              2000
i         x_i     p(x_i)  P(x_i)    p(x_i)  P(x_i)    p(x_i)  P(x_i)    p(x_i)  P(x_i)
1      30,000        614  1            787  1            905  1            955  1
2      60,000        187  0.3046       287  0.3647       383  0.4232       447  0.4681
3     120,000         67  0.1091       114  0.1449       152  0.1680       187  0.1958
4     240,000         26  0.0423        45  0.0572        67  0.0740        92  0.0963
5     480,000         10  0.0163        18  0.0229        27  0.0298        34  0.0356
6     960,000          5  0.0081        10  0.0127        12  0.0133        14  0.0147
7   1,920,000          2  0.0033         2  0.0025         4  0.0044         6  0.0063
8   3,840,000          2  0.0033         2  0.0025         2  0.0022         2  0.0021
9   7,680,000                            1  0.0013         1  0.0011         1  0.0010



The resulting data are shown in table 2 and plotted in figures 1 and 2, where
one can clearly see a power law behaviour for all years.² The cumulative
distribution P(x_i) was obtained by dividing p(x_i) by the total number of
cities with more than 30,000 inhabitants in each full census year (see
table 1). This means that P(x_i) is the probability that a Brazilian city has
a population equal to or greater than x_i.
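The binning rule of equation (7) and the normalization just described can be sketched in a few lines of Python. The cumulative counts are the 1970 values copied from table 2; the function names, and the helper that counts cities from a raw population list, are illustrative assumptions and not part of the original analysis.

```python
X_MIN = 30_000  # lower cutoff used in the paper

# Cumulative counts for the 1970 census, copied from table 2:
# number of cities with population >= x_i, for i = 1..8.
COUNTS_1970 = [614, 187, 67, 26, 10, 5, 2, 2]


def bin_edges(n_bins, x_min=X_MIN):
    """Lower bin edges x_i = 2**(i - 1) * x_min for i = 1..n_bins (equation 7)."""
    return [x_min * 2 ** (i - 1) for i in range(1, n_bins + 1)]


def cumulative_counts(populations, edges):
    """Count the cities with population >= each edge (hypothetical raw-data helper)."""
    return [sum(1 for pop in populations if pop >= edge) for edge in edges]


def cumulative_distribution(counts):
    """Divide each count by the first one, i.e. by the number of cities >= x_min."""
    total = counts[0]
    return [c / total for c in counts]


edges_1970 = bin_edges(len(COUNTS_1970))
P_1970 = cumulative_distribution(COUNTS_1970)

# Print the x_i and P(x_i) columns for 1970.
for x, p in zip(edges_1970, P_1970):
    print(f"{x:>10,d}  {p:.4f}")
```

With the raw census populations at hand, `cumulative_counts(populations, bin_edges(n))` would produce the count column directly; here the published counts are used instead because the raw data are not reproduced in the text.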

² Previous attempts made by us at plotting P(x_i) vs. x_i with x_min < 30,000
showed no power law behaviour in Brazilian cities with population smaller than
about 25,000-30,000 inhabitants. So, the transition to a power law behaviour
does seem to indicate the change between rural and urban population, that is,
the transition from spread out human settlements to the human population
aggregations we call cities. Hence, this cutoff in x_i can be used as the
critical value that allows us to obtain the fractions of urban and rural
populations in a country.
As discussed in §2 above, our samples assumed x_min = 30,000, which still
leaves α to be determined. To do so we have applied three different methods to
obtain the exponent: maximum likelihood estimator, least squares regression
and parameter averaging (a very simple bootstrap). These three methods should
converge to similar values of α and, taken together, are capable of detecting
possible systematic biases in the value of the exponent, known to arise from
simple fits to the plots (see [5,6]). One should notice that least squares
fitting is a good method for determining the exponent of a power law
distribution, provided the large fluctuations of the tail are significantly
reduced by the logarithmic binning (see [6]).







Fig. 1. Graph of the cumulative distribution function P(x_i) against the
population x_i of Brazilian cities with 30,000 people or more in the years of
1970 and 1980, on logarithmic axes. One can clearly see the decaying straight
line pattern of a power law behaviour with very little change over the time
span of the sample. One can also notice some fluctuations at the tail of the
plot, reflecting the very small number of cities with large population.
4 Results



4.1 Maximum Likelihood Estimator



A simple and reliable method for extracting the exponent is to employ the
following formula discussed in [5],

$$\alpha = 1 + n \left[ \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}} \right]^{-1}, \tag{8}$$

obtained by means of the maximum likelihood estimator (MLE). The results
are shown in table 3, whereas figures 3, 4, 5 and 6 show the exponent fits of
table 3 drawn as lines for each data set.
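Equation (8) translates directly into code. The sketch below is not the authors' implementation: since the raw census populations are not reproduced here, it checks the estimator on a synthetic sample drawn from a power law with a known exponent by inverse transform sampling. The function names, the seed and the sample size are arbitrary assumptions.

```python
import math
import random


def mle_exponent(populations, x_min):
    """Equation (8): alpha = 1 + n / sum(ln(x_i / x_min)), using only x_i >= x_min."""
    tail = [x for x in populations if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha, x_min, n, rng):
    """Draw n values with P(X >= x) = (x / x_min)**(-(alpha - 1)) by inverse transform."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


rng = random.Random(12345)  # fixed seed so the sketch is repeatable
synthetic = sample_power_law(alpha=2.4, x_min=30_000, n=20_000, rng=rng)
alpha_hat = mle_exponent(synthetic, x_min=30_000)
print(f"estimated alpha = {alpha_hat:.3f}")
```

For a sample of this size the estimate should fall close to the input value of 2.4. Note that the estimator works on the individual city populations, not on the binned values of table 2.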
Fig. 2. Same graph as in the previous figure, but with data of 1991 and 2000 censuses.
As before, one can clearly see the decaying straight line pattern of a power law
behaviour. However, the statistical fluctuations at the tail have virtually disappeared
as compared to the tail in figure 1, reflecting the fact that there is a bigger number
of cities with more than one million inhabitants in Brazil from 1991 on than in the
previous years.
Table 3
Results for α.

Method      1970           1980           1991           2000
α_MLE       2.41           2.36           2.36           2.36
α_LSF       2.23           2.23           2.25           2.26
α_PAE       2.22 ± 0.34    2.22 ± 0.34    2.25 ± 0.10    2.26 ± 0.11

4.2 Least Squares Fitting



As noted above, if the large uneven variation in the tail is strongly reduced,
the possible bias introduced in determining the power law exponent by least
squares fitting is also reduced, as discussed in [6]. In addition, we are
applying this method together with two other methodologies, which gives us
confidence in the final results. Results of least squares fitting (LSF) are
shown in table 3, whereas figures 3, 4, 5 and 6 show the line fits.
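A least squares fit of this kind can be sketched from the published counts alone. The example assumes an unweighted straight-line fit of ln P(x_i) against ln x_i, following equation (4); the text does not state whether the original fit was weighted, so this is an assumption, and the function name is illustrative.

```python
import math

X_MIN = 30_000

# Cumulative counts from table 2 (cities with population >= x_i).
COUNTS = {
    1970: [614, 187, 67, 26, 10, 5, 2, 2],
    2000: [955, 447, 187, 92, 34, 14, 6, 2, 1],
}


def lsf_exponent(counts, x_min=X_MIN):
    """Fit ln P(x_i) = (1 - alpha) ln x_i + beta by ordinary least squares."""
    xs = [math.log(x_min * 2 ** i) for i in range(len(counts))]
    ys = [math.log(c / counts[0]) for c in counts]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return 1.0 - slope  # the slope of equation (4) is 1 - alpha


for year, counts in COUNTS.items():
    print(year, round(lsf_exponent(counts), 2))
```

Under these assumptions the 1970 and 2000 columns give exponents close to the LSF entries of table 3.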

4.3 Parameter Averaging Estimator

This is in fact a very simple bootstrap estimator where, instead of taking
many random samples, we have just taken all possible combinations of two
points, without repetition, obtained α from the angular coefficient of the
line through each pair, and calculated the average and standard deviation of
all values of α. The aim was to produce an estimate of the error. By taking
only two points we have obtained a conservative estimate, in the sense that
using more than two points would decrease the error. However, viewing the
results of the parameter averaging estimator (PAE) together with the other two
estimators showed us that this conservative method is enough for the purposes
of this work. The results are also shown in table 3 and their line fits can be
found in figures 3, 4, 5 and 6.
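The pairwise procedure can be sketched as follows, again using the 1970 counts of table 2. The sketch assumes that each pair of points in the ln P(x_i) versus ln x_i plane gives one slope, that α is obtained as one minus that slope (equation 4), and that the quoted error is the sample standard deviation; these details are assumptions, as the text does not spell them out.

```python
import math
from itertools import combinations
from statistics import mean, stdev

X_MIN = 30_000

# Cumulative counts for 1970 from table 2 (cities with population >= x_i).
COUNTS_1970 = [614, 187, 67, 26, 10, 5, 2, 2]


def pae_exponent(counts, x_min=X_MIN):
    """Average alpha over all two-point slopes; return (mean, sample std dev)."""
    points = [
        (math.log(x_min * 2 ** i), math.log(c / counts[0]))
        for i, c in enumerate(counts)
    ]
    alphas = []
    for (x1, y1), (x2, y2) in combinations(points, 2):
        slope = (y2 - y1) / (x2 - x1)
        alphas.append(1.0 - slope)  # slope of ln P vs ln x is 1 - alpha
    return mean(alphas), stdev(alphas)


alpha_mean, alpha_std = pae_exponent(COUNTS_1970)
print(f"alpha = {alpha_mean:.2f} +/- {alpha_std:.2f}")
```

With eight points there are 28 pairs. Under the stated assumptions the 1970 result comes out close to the PAE entry of table 3, and the pair formed by the last two bins, which have equal counts, contributes a slope of zero and hence much of the spread.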

4.4 Discussion

The results obtained show that the LSF and PAE estimators produced basically
the same results, whereas all MLE derived exponents are a little higher. If
we take MLE as the best estimator, the other two suffered a bias of 8%,
6%, 5% and 4% for 1970, 1980, 1991 and 2000, respectively. Those biases
are well within the error obtained with the PAE estimator, showing that once
the statistical fluctuations at the tail are successfully reduced by means of an
appropriate logarithmic binning (appropriate choice of step and x_min), the LSF
estimator provides a good methodology. In fact, the bias decreases from its
maximum in 1970 to its minimum in 2000 simultaneously with a decrease in
the statistical fluctuations at the tail in these same years, brought about by
the introduction in the sample of more observed values at the tail due to
the increase in the number of cities with more than a million inhabitants.
In addition, a visual inspection of the fits in figures 3, 4, 5 and 6 shows that
MLE appears to be a better fitting methodology when statistical fluctuations
are larger (1970 and 1980) as compared to the smaller fluctuations in the
1991 and 2000 data sets.
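The quoted bias percentages can be recovered from table 3 with a short calculation. The sketch assumes that the bias is measured relative to the MLE value and that the PAE central values are used; the LSF values give nearly the same figures.

```python
# Exponents copied from table 3.
ALPHA_MLE = {1970: 2.41, 1980: 2.36, 1991: 2.36, 2000: 2.36}
ALPHA_PAE = {1970: 2.22, 1980: 2.22, 1991: 2.25, 2000: 2.26}


def relative_bias_percent(reference, estimate):
    """Relative difference of an estimate from the reference value, in percent."""
    return 100.0 * (reference - estimate) / reference


bias = {
    year: relative_bias_percent(ALPHA_MLE[year], ALPHA_PAE[year])
    for year in ALPHA_MLE
}

for year, value in bias.items():
    print(year, f"{value:.1f}%")
```

Rounded to whole percentages, these values correspond to the 8%, 6%, 5% and 4% quoted in the text.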

As an extension of our analysis it is interesting to probe why other authors
obtain results different from the universal value of α ≈ 2 for the power law
exponent of cities, apart from the large fluctuations at the tail and LSF fitting
mentioned above. For instance, [7] reported α ≈ 1 for cities in Indonesia for the
1961 to 1990 decennial censuses. For Indonesia's year 2000 census they found
an exponent smaller than one (see [7], table 2). Inasmuch as we saw above
that a normalized power law must have α > 1, a possible, and likely, cause
for these unexpected results is the absence of, or an inappropriate, x_min
definition for their samples. Then, without a proper normalization it is
probable that their exponent estimates suffered contamination from the region
of the plot where there is no power law behaviour. In other words, the set of
observed values from which [7] calculated α was probably contaminated with
data from small cities with few inhabitants, which should have been removed
from the data set used to calculate α. As seen above, finding x_min is a
critical step to avoid such a contamination.

Fig. 3. Plot of P(x_i) vs. the population x_i for 1970 data with the fits shown in table 3
drawn as lines. Notice that LSF and PAE estimates are almost equal to one another
and their line fits are superposed. In addition, one can also notice that MLE does
seem to provide a better fit for data with larger statistical fluctuations at the tail.

To summarize our results, conservative estimates for the exponent of the Zipf
law in Brazilian cities are reached by taking all methods within the error
margin. That results in α = 2.22 ± 0.34 for 1970 and 1980, and α = 2.26 ±
0.11 for 1991 and 2000. On the other hand, accurate results come from MLE
estimates, producing α = 2.41 for 1970 and α = 2.36 for the other years.



5 Conclusion 

In this paper we have discussed the Zipf law in Brazilian cities. We have
obtained data from censuses carried out in Brazil in the years of 1970, 1980,
1991 and 2000, from which we selected a sample that included only cities
with 30,000 or more inhabitants. Then we calculated the cumulative
distribution function P(x_i) of Brazilian cities, which gives the probability
that a city has a population equal to or bigger than x_i. We found that this
distribution does follow a decaying power law, whose exponent α was estimated
by three different methods: maximum likelihood estimator, least squares fitting
and parameter averaging estimator. Our results show that a conservative
estimate, which includes the results of all three methods, produces
α = 2.22 ± 0.34 in 1970 and 1980, and α = 2.26 ± 0.11 for 1991 and 2000. More
accurate results are given by the maximum likelihood estimator, showing
α = 2.41 for 1970 and α = 2.36 for all other years.

Fig. 4. Plot of P(x_i) vs. the population x_i for 1980 data with the fits shown in table
3 drawn as lines. As in figure 3, LSF and PAE results are almost the same, with
their line fits being drawn on top of each other. Again, MLE seems to handle best
the fluctuations at the tail.
Fig. 5. Plot of P(x_i) vs. the population x_i for 1991 data with the fits shown in table
3 drawn as lines. LSF and PAE results are exactly the same and the exponent found
with MLE is within the standard deviation of the PAE result.
Fig. 6. Plot of P(x_i) vs. the population x_i for 2000 data with the fits shown in
table 3 drawn as lines. As in figure 5, LSF and PAE results are the same and the MLE
estimate is within PAE's standard deviation. This is the census with the
smallest fluctuations at the tail as compared to the previous cases of 1970,
1980 and 1991, and the one where the three fitting methods differ least from
each other (see table 3).